MLE-Guided Parameter Search for Task Loss Minimization in Neural Sequence Modeling
Authors
Sean Welleck, Kyunghyun Cho
Abstract
Neural autoregressive sequence models are used to generate sequences in a variety of natural language processing (NLP) tasks, where they are evaluated according to sequence-level task losses. These models are typically trained with maximum likelihood estimation, which ignores the task loss, yet empirically performs well as a surrogate objective. Typical approaches to directly optimizing the task loss, such as policy gradient and minimum risk training, are based around sampling in the sequence space to obtain candidate update directions that are scored based on a single sequence. In this paper, we develop an alternative method based on random search in the parameter space that leverages access to the maximum likelihood gradient. We propose maximum likelihood guided parameter search (MGS), which samples from a distribution over update directions that is a mixture of random search around the current parameters and around the maximum likelihood gradient, with each direction weighted by its improvement to the task loss. MGS shifts sampling to the parameter space, and scores candidates using losses that are pooled from multiple sequences. Our experiments show that MGS is capable of optimizing sequence-level losses, with substantial reductions in repetition and non-termination in sequence completion, and similar improvements to those of minimum risk training in machine translation.
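To make the update described in the abstract concrete, here is a minimal sketch of one MGS-style step. It is not the paper's exact algorithm: the helpers `task_loss(theta)` (the sequence-level loss, pooled over a batch of sampled sequences) and `mle_gradient(theta)` are hypothetical stand-ins, and the even mixture split and softmax weighting over loss improvements are one plausible reading of "weighted by its improvement to the task loss".

```python
import numpy as np

def mgs_step(theta, task_loss, mle_gradient, n_candidates=8,
             sigma=0.01, lr=1.0, temperature=1.0, rng=None):
    """One MGS-style update (a sketch, not the paper's exact algorithm)."""
    rng = rng or np.random.default_rng()
    g = mle_gradient(theta)
    g_hat = g / (np.linalg.norm(g) + 1e-8)    # normalized MLE direction
    base = task_loss(theta)                   # pooled over many sequences

    deltas, gains = [], []
    for _ in range(n_candidates):
        noise = rng.normal(scale=sigma, size=theta.shape)
        # Mixture: half the candidates search around the current parameters,
        # half around a step along the MLE descent direction.
        delta = noise if rng.random() < 0.5 else -sigma * g_hat + noise
        deltas.append(delta)
        gains.append(base - task_loss(theta + delta))   # loss improvement

    # Softmax-weight each direction by its improvement in the task loss
    # (max-subtracted for numerical stability).
    w = np.exp((np.array(gains) - max(gains)) / temperature)
    w /= w.sum()
    return theta + lr * sum(wk * dk for wk, dk in zip(w, deltas))
```

Note how this differs from policy gradient or minimum risk training as characterized above: the noise is injected in parameter space rather than sequence space, and each candidate is scored with a loss pooled over multiple sequences rather than a single sample.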
Similar resources
Automatic Parameter Selection for Sequence Similarity Search
We show that simulated annealing search can be used to automatically select parameters and find highly similar data regions using a modified version of the DNA-DNA Sequence Similarity Search program. We call this modified program AutoSimS. We use the average score of high-scoring chains to measure the goodness of the resulting sequence similarity search, and use adaptive simulated annealing to ...
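As a rough illustration of the kind of search AutoSimS performs, the sketch below runs a generic simulated-annealing loop over a parameter setting. The `score` and `neighbor` callables are hypothetical stand-ins for the average high-scoring-chain score and a parameter perturbation, and the geometric cooling schedule is an assumption rather than the adaptive schedule the paper uses.

```python
import math, random

def anneal(score, init_params, neighbor, steps=1000, t0=1.0, cooling=0.995):
    """Maximize `score` over parameter settings by simulated annealing (sketch)."""
    params = best = init_params
    current = best_score = score(init_params)
    t = t0
    for _ in range(steps):
        cand = neighbor(params)                 # perturb the current setting
        s = score(cand)
        # Always accept improvements; accept worse settings with a
        # temperature-dependent probability so the search can escape
        # local optima early on.
        if s > current or random.random() < math.exp((s - current) / t):
            params, current = cand, s
            if s > best_score:
                best, best_score = cand, s
        t *= cooling                            # geometric cooling
    return best, best_score

# Demo on a toy score function with a known maximizer at p = 3.
best, s = anneal(score=lambda p: -(p - 3.0) ** 2,
                 init_params=0.0,
                 neighbor=lambda p: p + random.uniform(-0.5, 0.5))
print(best, s)
```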
Modeling Loss Data by Phase-Type Distribution
Insurers have always been concerned about the losses of the policies they cover, and they look for methods to model past loss data with the aim of making an optimal decision. In this research, phase-type distributions are introduced for modeling loss data, including the relevant statistical inference and the use of the EM algorithm to estimate the distribution parameters. Finally, the possibility of using this distribution in modeling grouped ...
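As a concrete illustration of the EM fitting mentioned above, the sketch below estimates a hyperexponential (mixture-of-exponentials) model, a simple subclass of the phase-type family. The two-component setup, function name, and synthetic data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def em_hyperexponential(x, k=2, iters=200, rng=None):
    """Fit mixture weights w and rates lam to loss data x by EM (sketch)."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x, dtype=float)
    w = np.full(k, 1.0 / k)
    lam = rng.uniform(0.5, 2.0, size=k) / x.mean()   # rough initial rates
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        dens = w * lam * np.exp(-np.outer(x, lam))   # shape (n, k)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: closed-form updates for weights and rates.
        w = r.mean(axis=0)
        lam = r.sum(axis=0) / (r * x[:, None]).sum(axis=0)
    return w, lam

# Example: recover the rates from synthetic "loss" data drawn from
# two exponentials with means 1 and 5 (rates 1.0 and 0.2).
data = np.concatenate([np.random.default_rng(1).exponential(1.0, 400),
                       np.random.default_rng(2).exponential(5.0, 400)])
print(em_hyperexponential(data))
```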
Task Loss Estimation for Sequence Prediction
Often, the performance on a supervised machine learning task is evaluated with a task loss function that cannot be optimized directly. Examples of such loss functions include the classification error, the edit distance and the BLEU score. A common workaround for this problem is to instead optimize a surrogate loss function, such as, for instance, cross-entropy or hinge loss. In order for this rem...
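The sketch below illustrates the workaround described here on a toy problem: training minimizes the differentiable cross-entropy surrogate, while the non-differentiable task loss (edit distance) is only evaluated. The free logits tensor standing in for a model is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def edit_distance(a, b):
    """Levenshtein distance between two token sequences (the task loss)."""
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
         for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[len(a)][len(b)]

length, vocab = 4, 5
logits = torch.zeros(length, vocab, requires_grad=True)   # toy "model"
target = torch.tensor([1, 2, 3, 4])
opt = torch.optim.SGD([logits], lr=0.5)
for _ in range(100):
    opt.zero_grad()
    F.cross_entropy(logits, target).backward()   # surrogate: differentiable
    opt.step()
pred = logits.argmax(dim=-1).tolist()
print(edit_distance(pred, target.tolist()))      # task loss: evaluated only
```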
The Search for the Self in Beckett's Theatre: Waiting for Godot and Endgame
This thesis is based upon the works of Samuel Beckett, one of the greatest writers of contemporary literature. Here, I have tried to focus on one of the main themes in Beckett's works: the search for the real "me" or the real self, which is not only a problem to be solved for Beckett's man but also for each of us. I have tried to show Beckett's techniques in approaching this unattainable goal, base...
Implicit Sequence Learning in a Search Task
This study investigated the effects of selection demands on implicit sequence learning. Participants in a search condition looked for a target among seven distractors and responded based on the target's identity. The responses followed a deterministic sequence, and sequence learning was compared to that found in two control conditions in which the targets were presented alone, either at a central locat...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2021
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v35i16.17652